Do not close FDs 0, 1, or 2 #186

Open
wants to merge 9 commits into
base: main

DemiMarie
Contributor

If they are closed, another file descriptor could be created with these numbers, and so standard library functions that use them might write to an unwanted place. dup2() a file descriptor to /dev/null over them instead.
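
A minimal sketch of the hazard and the proposed fix, not the PR's actual code. The helper names `reopen_after_close` and `redirect_to_devnull` are hypothetical and exist only for illustration. The first shows that a closed descriptor number is handed out again by the next `open()`; the second keeps the number occupied by pointing it at /dev/null.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: close fd, then open /dev/null and return the number
 * the kernel hands out. open() returns the lowest unused descriptor, so if
 * every lower number is in use the result is fd itself. */
static int reopen_after_close(int fd)
{
    close(fd);
    return open("/dev/null", O_WRONLY);
}

/* Hypothetical helper: keep fd occupied by pointing it at /dev/null.
 * Returns 0 on success, -1 on error. */
static int redirect_to_devnull(int fd)
{
    int null_fd = open("/dev/null", O_RDWR);
    if (null_fd < 0)
        return -1;
    if (null_fd == fd)
        return 0; /* fd was unused, so open() already landed on it */
    int rc = dup2(null_fd, fd) < 0 ? -1 : 0;
    close(null_fd); /* the duplicate on fd stays open */
    return rc;
}
```

Calling `redirect_to_devnull()` for each of 0, 1, and 2 would leave all three numbers valid, so a later stray write through stdio is discarded instead of landing in whatever file happened to reuse the number.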


codecov bot commented Jan 9, 2025

Codecov Report

Attention: Patch coverage is 54.26357% with 59 lines in your changes missing coverage. Please review.

Project coverage is 78.96%. Comparing base (63e0699) to head (61a0261).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| agent/qrexec-agent.c | 55.10% | 44 Missing ⚠️ |
| libqrexec/exec.c | 53.33% | 7 Missing ⚠️ |
| agent/qrexec-exec-program.c | 50.00% | 6 Missing ⚠️ |
| agent/qrexec-fork-server.c | 0.00% | 1 Missing ⚠️ |
| daemon/qrexec-daemon.c | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #186      +/-   ##
==========================================
+ Coverage   78.84%   78.96%   +0.11%     
==========================================
  Files          55       56       +1     
  Lines       10146    10183      +37     
==========================================
+ Hits         8000     8041      +41     
+ Misses       2146     2142       -4     


@DemiMarie
Contributor Author

Codecov appears to not be testing what happens in the child process after fork() and the error path of “cannot open /dev/null”.

@marmarek
Member

AFAIR, unit tests do not cover the PAM handling part: they do not run as a system service, test runners don't have the necessary PAM configuration, etc.

@qubesos-bot

qubesos-bot commented Jan 11, 2025

OpenQA test summary

Complete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025021620-4.3&flavor=pull-requests

Test run included the following:

New failures, excluding unstable

Compared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025020404-4.3&flavor=update

  • system_tests_whonix@hw7

    • whonixcheck: fail (unknown)
      Whonixcheck for sys-whonix failed...

    • whonixcheck: unnamed test (unknown)

  • system_tests_suspend@hw1

    • suspend: unnamed test (unknown)
    • suspend: Failed (test died)
      # Test died: no candidate needle with tag(s) 'SUSPEND-FAILED' match...
  • system_tests_gui_tools@hw7

    • desktop_linux_manager_create_qube: unnamed test (unknown)

    • desktop_linux_manager_create_qube: Failed (test died)
      # Test died: no candidate needle with tag(s) 'new-qube-select-name'...

    • desktop_linux_manager_create_qube: unnamed test (unknown)

  • system_tests_whonix

    • whonixcheck: fail (unknown)
      Whonixcheck for sys-whonix failed...

    • whonixcheck: unnamed test (unknown)

  • system_tests_suspend

    • suspend: unnamed test (unknown)
    • suspend: Failed (test died)
      # Test died: no candidate needle with tag(s) 'SUSPEND-FAILED' match...
  • system_tests_extra

    • TC_00_QVCTest_whonix-workstation-17: test_010_screenshare (failure)
      ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^... AssertionError: 0 == 0
  • system_tests_qrexec

  • system_tests_misc

    • TC_06_AppVM_debian-12-xfce: test_020_custom_persist (failure)
      AssertionError: Too much / too little files persisted: b''b"stat: c...

    • TC_06_AppVM_debian-12-xfce: test_110_rescue_console (failure)
      AssertionError: Calling whoami failed, but emergency console started

    • TC_06_AppVM_debian-12-xfce: test_111_rescue_console_initrd (failure)
      AssertionError: Emergency mode not found

    • TC_06_AppVM_fedora-41-xfce: test_020_custom_persist (failure)
      AssertionError: Too much / too little files persisted: b''b"stat: c...

    • TC_06_AppVM_fedora-41-xfce: test_110_rescue_console (failure)
      AssertionError: Calling whoami failed, but emergency console started

    • TC_06_AppVM_whonix-gateway-17: test_020_custom_persist (failure)
      AssertionError: Too much / too little files persisted: b' File: /h...

    • TC_06_AppVM_whonix-gateway-17: test_110_rescue_console (failure)
      AssertionError: Calling whoami failed, but emergency console started

    • TC_06_AppVM_whonix-gateway-17: test_111_rescue_console_initrd (error)
      qubes.exc.QubesVMError: Cannot connect to qrexec agent for 120 seco...

    • TC_06_AppVM_whonix-workstation-17: test_020_custom_persist (failure)
      AssertionError: Too much / too little files persisted: b' File: /h...

    • TC_06_AppVM_whonix-workstation-17: test_110_rescue_console (failure)
      AssertionError: Calling whoami failed, but emergency console started

    • TC_06_AppVM_whonix-workstation-17: test_111_rescue_console_initrd (error)
      qubes.exc.QubesVMError: Cannot connect to qrexec agent for 120 seco...

  • system_tests_gui_tools

    • desktop_linux_manager_config: unnamed test (unknown)
    • desktop_linux_manager_config: Failed (test died)
      # Test died: no candidate needle with tag(s) 'qubes-global-config-t...

Failed tests

28 failures
  • system_tests_whonix@hw7

    • whonixcheck: fail (unknown)
      Whonixcheck for sys-whonix failed...

    • whonixcheck: unnamed test (unknown)

  • system_tests_suspend@hw1

    • suspend: unnamed test (unknown)
    • suspend: Failed (test died)
      # Test died: no candidate needle with tag(s) 'SUSPEND-FAILED' match...
  • system_tests_gui_tools@hw7

    • desktop_linux_manager_create_qube: unnamed test (unknown)

    • desktop_linux_manager_create_qube: Failed (test died)
      # Test died: no candidate needle with tag(s) 'new-qube-select-name'...

    • desktop_linux_manager_create_qube: unnamed test (unknown)

  • system_tests_whonix

    • whonixcheck: fail (unknown)
      Whonixcheck for sys-whonix failed...

    • whonixcheck: unnamed test (unknown)

  • system_tests_suspend

    • suspend: unnamed test (unknown)
    • suspend: Failed (test died)
      # Test died: no candidate needle with tag(s) 'SUSPEND-FAILED' match...
  • system_tests_extra

    • TC_00_QVCTest_whonix-workstation-17: test_010_screenshare (failure)
      ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^... AssertionError: 0 == 0
  • system_tests_qrexec

  • system_tests_kde_gui_interactive

    • clipboard_and_web: unnamed test (unknown)
    • clipboard_and_web: Failed (test died)
      # Test died: no candidate needle with tag(s) 'clipboard-paste-notif...
  • system_tests_misc

    • TC_06_AppVM_debian-12-xfce: test_020_custom_persist (failure)
      AssertionError: Too much / too little files persisted: b''b"stat: c...

    • TC_06_AppVM_debian-12-xfce: test_110_rescue_console (failure)
      AssertionError: Calling whoami failed, but emergency console started

    • TC_06_AppVM_debian-12-xfce: test_111_rescue_console_initrd (failure)
      AssertionError: Emergency mode not found

    • TC_06_AppVM_fedora-41-xfce: test_020_custom_persist (failure)
      AssertionError: Too much / too little files persisted: b''b"stat: c...

    • TC_06_AppVM_fedora-41-xfce: test_110_rescue_console (failure)
      AssertionError: Calling whoami failed, but emergency console started

    • TC_06_AppVM_whonix-gateway-17: test_020_custom_persist (failure)
      AssertionError: Too much / too little files persisted: b' File: /h...

    • TC_06_AppVM_whonix-gateway-17: test_110_rescue_console (failure)
      AssertionError: Calling whoami failed, but emergency console started

    • TC_06_AppVM_whonix-gateway-17: test_111_rescue_console_initrd (error)
      qubes.exc.QubesVMError: Cannot connect to qrexec agent for 120 seco...

    • TC_06_AppVM_whonix-workstation-17: test_020_custom_persist (failure)
      AssertionError: Too much / too little files persisted: b' File: /h...

    • TC_06_AppVM_whonix-workstation-17: test_110_rescue_console (failure)
      AssertionError: Calling whoami failed, but emergency console started

    • TC_06_AppVM_whonix-workstation-17: test_111_rescue_console_initrd (error)
      qubes.exc.QubesVMError: Cannot connect to qrexec agent for 120 seco...

  • system_tests_gui_tools

    • desktop_linux_manager_config: unnamed test (unknown)
    • desktop_linux_manager_config: Failed (test died)
      # Test died: no candidate needle with tag(s) 'qubes-global-config-t...

Fixed failures

Compared to: https://openqa.qubes-os.org/tests/127852#dependencies

29 fixed
  • system_tests_qrexec_perf@hw1

    • TC_00_QrexecPerf_debian-12-xfce: test_110_simple_data_duplex (failure)
      AssertionError: '/usr/lib/qubes/tests/qrexec_perf.py --vm1=test-ins...
  • system_tests_storage_perf@hw1

    • integ: storage_perf (error)
      ModuleNotFoundError: No module named 'qubes.tests.integ.storage_perf'
  • system_tests_suspend

    • mount_and_boot_options: unnamed test (unknown)
    • mount_and_boot_options: Failed (test died)
      # Test died: no candidate needle with tag(s) 'x11' matched...
  • system_tests_backup

    • TC_10_BackupVM_whonix-gateway-17: test_110_send_to_vm_no_space (error)
      subprocess.CalledProcessError: Command 'mknod /dev/loop0 b 7 0;trun...

    • TC_10_BackupVM_whonix-workstation-17: test_110_send_to_vm_no_space (error)
      subprocess.CalledProcessError: Command 'mknod /dev/loop0 b 7 0;trun...

  • system_tests_qrexec

  • system_tests_dispvm

    • TC_20_DispVM_fedora-41-xfce: test_100_open_in_dispvm (failure)
      AssertionError: './open-file test.txt' failed with ./open-file test...
  • system_tests_devices

    • TC_00_List_whonix-gateway-17: test_000_list_loop (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-gateway-17: test_001_list_loop_mounted (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-gateway-17: test_010_list_dm (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-gateway-17: test_011_list_dm_mounted (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-gateway-17: test_012_list_dm_delayed (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-gateway-17: test_013_list_dm_removed (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-gateway-17: test_020_list_loop_partition (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-gateway-17: test_021_list_loop_partition_mounted (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-workstation-17: test_000_list_loop (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-workstation-17: test_001_list_loop_mounted (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-workstation-17: test_010_list_dm (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-workstation-17: test_011_list_dm_mounted (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-workstation-17: test_012_list_dm_delayed (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-workstation-17: test_013_list_dm_removed (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-workstation-17: test_020_list_loop_partition (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_00_List_whonix-workstation-17: test_021_list_loop_partition_mounted (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_10_Attach_whonix-gateway-17: test_000_attach_reattach (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

    • TC_10_Attach_whonix-workstation-17: test_000_attach_reattach (error)
      subprocess.CalledProcessError: Command 'set -e;truncate -s 128M /tm...

  • system_tests_audio

  • system_tests_basic_vm_qrexec_gui_ext4

    • switch_pool: Failed (test died)
      # Test died: command 'printf "label: gpt\n,,L" | sfdisk /dev/sdb' f...

Unstable tests

Performance Tests

Performance degradation:

No issues

Remaining performance tests:

72 tests
  • debian-12-xfce_exec: 7.26
  • debian-12-xfce_exec-root: 27.20
  • debian-12-xfce_socket: 8.29
  • debian-12-xfce_socket-root: 8.38
  • debian-12-xfce_exec-data-simplex: 45.46
  • debian-12-xfce_exec-data-duplex: 48.27
  • debian-12-xfce_exec-data-duplex-root: 65.16
  • debian-12-xfce_socket-data-duplex: 79.93
  • fedora-41-xfce_exec: 9.17 🟢 ( previous job: 21.67, improvement: 42.33%)
  • fedora-41-xfce_exec-root: 60.36 🟢 ( previous job: 74.48, improvement: 81.04%)
  • fedora-41-xfce_socket: 8.97 🟢 ( previous job: 21.26, improvement: 42.18%)
  • fedora-41-xfce_socket-root: 8.55 🟢 ( previous job: 20.89, improvement: 40.95%)
  • fedora-41-xfce_exec-data-simplex: 48.83 🟢 ( previous job: 58.71, improvement: 83.17%)
  • fedora-41-xfce_exec-data-duplex: 49.41 🟢 ( previous job: 60.49, improvement: 81.68%)
  • fedora-41-xfce_exec-data-duplex-root: 83.73 🟢 ( previous job: 92.13, improvement: 90.89%)
  • fedora-41-xfce_socket-data-duplex: 78.88 🟢 ( previous job: 83.57, improvement: 94.39%)
  • whonix-gateway-17_exec: 7.52
  • whonix-gateway-17_exec-root: 37.55
  • whonix-gateway-17_socket: 7.73
  • whonix-gateway-17_socket-root: 7.64
  • whonix-gateway-17_exec-data-simplex: 46.03
  • whonix-gateway-17_exec-data-duplex: 46.90
  • whonix-gateway-17_exec-data-duplex-root: 70.52
  • whonix-gateway-17_socket-data-duplex: 83.04
  • whonix-workstation-17_exec: 8.25
  • whonix-workstation-17_exec-root: 52.70
  • whonix-workstation-17_socket: 8.53
  • whonix-workstation-17_socket-root: 8.14
  • whonix-workstation-17_exec-data-simplex: 46.39
  • whonix-workstation-17_exec-data-duplex: 48.34
  • whonix-workstation-17_exec-data-duplex-root: 78.38
  • whonix-workstation-17_socket-data-duplex: 83.34
  • dom0_root_seq1m_q8t1_read 3:read_bandwidth_kb: 481439.00
  • dom0_root_seq1m_q8t1_write 3:write_bandwidth_kb: 144485.00
  • dom0_root_seq1m_q1t1_read 3:read_bandwidth_kb: 413476.00
  • dom0_root_seq1m_q1t1_write 3:write_bandwidth_kb: 202236.00
  • dom0_root_rnd4k_q32t1_read 3:read_bandwidth_kb: 89678.00
  • dom0_root_rnd4k_q32t1_write 3:write_bandwidth_kb: 5239.00
  • dom0_root_rnd4k_q1t1_read 3:read_bandwidth_kb: 10882.00
  • dom0_root_rnd4k_q1t1_write 3:write_bandwidth_kb: 1092.00
  • dom0_varlibqubes_seq1m_q8t1_read 3:read_bandwidth_kb: 490906.00
  • dom0_varlibqubes_seq1m_q8t1_write 3:write_bandwidth_kb: 136773.00
  • dom0_varlibqubes_seq1m_q1t1_read 3:read_bandwidth_kb: 432759.00
  • dom0_varlibqubes_seq1m_q1t1_write 3:write_bandwidth_kb: 201941.00
  • dom0_varlibqubes_rnd4k_q32t1_read 3:read_bandwidth_kb: 98972.00
  • dom0_varlibqubes_rnd4k_q32t1_write 3:write_bandwidth_kb: 5938.00
  • dom0_varlibqubes_rnd4k_q1t1_read 3:read_bandwidth_kb: 7670.00
  • dom0_varlibqubes_rnd4k_q1t1_write 3:write_bandwidth_kb: 3045.00
  • fedora-41-xfce_root_seq1m_q8t1_read 3:read_bandwidth_kb: 350929.00
  • fedora-41-xfce_root_seq1m_q8t1_write 3:write_bandwidth_kb: 269764.00
  • fedora-41-xfce_root_seq1m_q1t1_read 3:read_bandwidth_kb: 294048.00
  • fedora-41-xfce_root_seq1m_q1t1_write 3:write_bandwidth_kb: 65822.00
  • fedora-41-xfce_root_rnd4k_q32t1_read 3:read_bandwidth_kb: 87250.00
  • fedora-41-xfce_root_rnd4k_q32t1_write 3:write_bandwidth_kb: 1585.00
  • fedora-41-xfce_root_rnd4k_q1t1_read 3:read_bandwidth_kb: 8014.00
  • fedora-41-xfce_root_rnd4k_q1t1_write 3:write_bandwidth_kb: 1099.00
  • fedora-41-xfce_private_seq1m_q8t1_read 3:read_bandwidth_kb: 396137.00
  • fedora-41-xfce_private_seq1m_q8t1_write 3:write_bandwidth_kb: 108284.00
  • fedora-41-xfce_private_seq1m_q1t1_read 3:read_bandwidth_kb: 269349.00
  • fedora-41-xfce_private_seq1m_q1t1_write 3:write_bandwidth_kb: 167192.00
  • fedora-41-xfce_private_rnd4k_q32t1_read 3:read_bandwidth_kb: 82037.00
  • fedora-41-xfce_private_rnd4k_q32t1_write 3:write_bandwidth_kb: 1915.00
  • fedora-41-xfce_private_rnd4k_q1t1_read 3:read_bandwidth_kb: 8579.00
  • fedora-41-xfce_private_rnd4k_q1t1_write 3:write_bandwidth_kb: 1637.00
  • fedora-41-xfce_volatile_seq1m_q8t1_read 3:read_bandwidth_kb: 419598.00
  • fedora-41-xfce_volatile_seq1m_q8t1_write 3:write_bandwidth_kb: 175595.00
  • fedora-41-xfce_volatile_seq1m_q1t1_read 3:read_bandwidth_kb: 267289.00
  • fedora-41-xfce_volatile_seq1m_q1t1_write 3:write_bandwidth_kb: 97143.00
  • fedora-41-xfce_volatile_rnd4k_q32t1_read 3:read_bandwidth_kb: 76431.00
  • fedora-41-xfce_volatile_rnd4k_q32t1_write 3:write_bandwidth_kb: 3651.00
  • fedora-41-xfce_volatile_rnd4k_q1t1_read 3:read_bandwidth_kb: 9135.00
  • fedora-41-xfce_volatile_rnd4k_q1t1_write 3:write_bandwidth_kb: 1252.00

@marmarek
Member

system_tests_qrexec

* TC_00_Qrexec_debian-12-xfce: [test_053_qrexec_vm_service_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_debian-12-xfce/4) (failure)
  `AssertionError: Timeout, probably EOF wasn't transferred`

* TC_00_Qrexec_debian-12-xfce: [test_092_qrexec_service_socket_dom0_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_debian-12-xfce/18) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred fr...`

* TC_00_Qrexec_debian-12-xfce: [test_098_qrexec_service_socket_vm_eof](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_debian-12-xfce/23) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred to...`

* TC_00_Qrexec_fedora-41-xfce: [test_053_qrexec_vm_service_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_fedora-41-xfce/4) (failure)
  `AssertionError: Timeout, probably EOF wasn't transferred`

* TC_00_Qrexec_fedora-41-xfce: [test_092_qrexec_service_socket_dom0_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_fedora-41-xfce/18) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred fr...`

* TC_00_Qrexec_fedora-41-xfce: [test_098_qrexec_service_socket_vm_eof](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_fedora-41-xfce/23) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred to...`

* TC_00_Qrexec_whonix-gateway-17: [test_053_qrexec_vm_service_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-gateway-17/4) (failure)
  `AssertionError: Timeout, probably EOF wasn't transferred`

* TC_00_Qrexec_whonix-gateway-17: [test_083_qrexec_service_argument_specific_implementation](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-gateway-17/14) (error)
  `subprocess.CalledProcessError: Command '/usr/lib/qubes/qrexec-clien...`

* TC_00_Qrexec_whonix-gateway-17: [test_092_qrexec_service_socket_dom0_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-gateway-17/18) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred fr...`

* TC_00_Qrexec_whonix-gateway-17: [test_098_qrexec_service_socket_vm_eof](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-gateway-17/23) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred to...`

* TC_00_Qrexec_whonix-workstation-17: [test_053_qrexec_vm_service_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-workstation-17/4) (failure)
  `AssertionError: Timeout, probably EOF wasn't transferred`

* TC_00_Qrexec_whonix-workstation-17: [test_092_qrexec_service_socket_dom0_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-workstation-17/18) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred fr...`

* TC_00_Qrexec_whonix-workstation-17: [test_098_qrexec_service_socket_vm_eof](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-workstation-17/23) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred to...`

This is the only qrexec PR in this test run, so the above failures seem to be a regression caused by this change.

@@ -162,6 +162,11 @@ void buffer_append(struct buffer *b, const char *data, int len);
void buffer_remove(struct buffer *b, int len);
int buffer_len(struct buffer *b);
void *buffer_data(struct buffer *b);
/* Open /dev/null and keep it from being closed before the exec func is called.
Member

Isn't it simpler (and safer) to simply open /dev/null just before dup-ing it over 0,1,2 (in the child process already)?

Contributor Author

It’s definitely simpler, but I didn’t want the extra call to open(). That said, giving each child process the same open file description is less than great, as it introduces shared state that should not be there. I don’t know if /dev/null has any such state, but still it isn’t great, so I went ahead and switched to your approach.
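
A rough sketch of the approach settled on in this thread, under the assumption of a plain fork-and-exec flow. `run_with_null_stdio` is a hypothetical name, not a function from qrexec, and the exit code 127 is an arbitrary choice for this illustration. Each child opens /dev/null itself, so no open file description is shared between children.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run path with stdin, stdout and stderr pointing at a
 * /dev/null description owned by the child alone. Returns the child's exit
 * status, or -1 on error or abnormal termination. */
static int run_with_null_stdio(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: open /dev/null here, just before dup-ing it over 0, 1, 2. */
        int null_fd = open("/dev/null", O_RDWR);
        if (null_fd < 0)
            _exit(127);
        for (int fd = 0; fd <= 2; fd++) {
            if (null_fd != fd && dup2(null_fd, fd) < 0)
                _exit(127);
        }
        if (null_fd > 2)
            close(null_fd); /* only the three duplicates remain */
        execv(path, argv);
        _exit(127); /* exec failed */
    }
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The cost is one extra `open()` per child, which is the trade-off mentioned above; in exchange the parent never has to keep a /dev/null descriptor alive across other cleanup code.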

@marmarek
Member

But also, I question the usefulness of this PR as a whole: the closing of standard FDs happens in a process that has a single purpose, which is to wait for the child process and then exit, in the very same function where the closing happens. There are a few PAM cleanup calls, but it's very unlikely for them to be a problem (especially since it isn't a problem now, and hasn't been for the last 10 or so years).

@DemiMarie
Contributor Author

Whether or not the last commit in the PR is merged, I definitely think the other commits should be merged. In particular, it turned out that the “close the FD” functionality had no unit tests because the unit tests took a codepath that was too different from the production code. This PR makes the production and test code follow the same path, with the result that the actual bug (/dev/null FD being closed by fix_fds()) is now caught. I think that this test improvement (and the other bug fixes) is itself useful.

> There are a few PAM cleanup calls, but it's very unlikely for them to be a problem (especially since it isn't a problem now, and hasn't been for the last 10 or so years).

PAM cleanup calls into PAM modules, so it can do anything. I suspect Qubes OS only gets away with it because we have a fairly simple PAM stack by default. PAM cleanup is used for e.g. unmounting filesystems and closing encrypted volumes.

The best approach would be for PAM to run with stdin pointed at /dev/null and stdout and stderr pointed at the system log. The FDs would be fixed directly before executing the child process. That’s a bigger refactor, though.

@marmarek
Member

marmarek commented Jan 12, 2025

> Whether or not the last commit in the PR is merged,

Indeed I was talking about the last commit (which until the last force-push was the only commit in this PR).

> PAM cleanup calls into PAM modules, so it can do anything. I suspect Qubes OS only gets away with it because we have a fairly simple PAM stack by default.

Aren't PAM modules expected to handle proper logging themselves? I don't think they are supposed to touch the calling process's stdin/out/err in any case. And if they did, that would likely also interfere with cases where they aren't closed (and then replaced with an unrelated thing): for example, it could interfere with an application log file on stderr that is expected to be in a specific format (different from PAM messages).

@marmarek
Member

As for the other commits, won't that have some non-trivial conflicts with #141 (which I hope is quite close to merge-able state)?

@DemiMarie
Contributor Author

> As for the other commits, won't that have some non-trivial conflicts with #141 (which I hope is quite close to merge-able state)?

I can include them in #141 or rebase this PR on top of it. I can also close this PR if you prefer, but I’d prefer that at least the bug fixes and testability changes go in.

@marmarek
Member

A lot of tests have failed; see the OpenQA report from the bot.

@DemiMarie
Contributor Author

> A lot of tests have failed; see the OpenQA report from the bot.

I’ll work on fixing this.

@DemiMarie
Contributor Author

I did some tests locally and it looks like the existing code already has a problem (one of the tests fails intermittently when I run it in a loop).

@marmarek
Member

Maybe there are some other problems too, but clearly several tests reliably fail with this PR, while they reliably pass without it. Do you want access to a VM with this PR installed?

@DemiMarie
Contributor Author

> Maybe there are some other problems too, but clearly several tests reliably fail with this PR, while they reliably pass without it. Do you want access to a VM with this PR installed?

I already have one, and I can indeed reproduce the bug. It appears that at least some tests fail more often with this PR than without it, but they fail occasionally even without the PR, as shown by a `while sudo qubes.tests.run TEST; do :; done` loop terminating fairly quickly. This makes me suspect that there is a pre-existing bug (probably a race condition) in either qrexec or the tests, and this code happens to make the race condition more likely to trigger.

The way I found this is that I tried to bisect the bug. I found 9b2e8fc (v4.3.3 + bug fixes) was bad if I ran one test in a loop, which in turn led me to try 63e0699 (v4.3.3 itself) which also failed.

@marmarek
Member

Recent run with db5faa9 has only one failure (1 failed out of 4 times this test was run)

I tried to get more logs, but it isn't very interesting:

2025-02-16 07:58:16.551 qrexec-fork-server[1054]: qrexec-agent-data.c:274:handle_new_process_common: executed: QUBESRPC test.Argument+argument test-inst-vm1 (pid 1056)
2025-02-16 07:58:16.553 qrexec-fork-server[1054]: qrexec-agent-data.c:309:handle_new_process_common: pid 1056 exited with 125

I don't see the actual reason for the failure here...

These should never happen, but call exit() if they do.  Also avoid
freeing an uninitialized PAM handle in such an error case.

I do not consider this a security vulnerability because there is no
reasonable way I know of for an attacker to trigger this failure, but
this commit should still be backported.
Saves an (admittedly cheap) system call.
No functional change intended.
This will be used by tests later.

No functional change intended.
This will be used by tests later.  No functional change intended.
This also fixes a bug: basename can mutate its argument, so a copy must
be passed to it.
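
As an illustration of the basename issue (a sketch, not the commit's code): POSIX allows `basename()` to modify the string it is given, so a caller holding a `const` or shared path should hand it a private copy. `basename_copy` is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of the last component
 * of path, leaving path itself untouched. The caller frees the result.
 * Returns NULL if allocation fails. */
static char *basename_copy(const char *path)
{
    char *scratch = strdup(path); /* basename() may write into this copy */
    if (scratch == NULL)
        return NULL;
    /* The result may point into scratch, so duplicate it before freeing. */
    char *result = strdup(basename(scratch));
    free(scratch);
    return result;
}
```

Duplicating the result as well avoids returning a pointer into the scratch buffer, or into static storage that a later `basename()` call could overwrite.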
This makes the unit test code more like the actual code used by
end-users, and therefore makes the tests more accurate.

This trips a bug in the code which will be fixed later, requiring a test
to be changed to compensate.
If they are closed, another file descriptor could be created with these
numbers, and so standard library functions that use them might write to
an unwanted place.  dup2() a file descriptor to /dev/null over them
instead.

Also statically initialize trigger_fd to -1, which is the conventional
value for an invalid file descriptor.

This requires care to avoid closing the file descriptor to /dev/null in
fix_fds(), which took over an hour to debug.
3 participants